Task Scheduling using Block Dependency DAG for Block-Oriented Sparse Cholesky Factorization

Authors

  • Heejo Lee
  • Jong Kim
  • Sunggu Lee
Abstract

The block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular sub-blocks and handles each block as a computational unit in order to increase data reuse in a hierarchical memory system. The method also increases the degree of concurrency while reducing communication volume, so it performs more efficiently on a distributed-memory multiprocessor system than the customary column-oriented factorization. Until now, however, the mapping of blocks to processors has been designed for load balance within restricted communication patterns. In this paper, we represent tasks using a block dependency DAG, which shows the execution behavior of the block sparse Cholesky factorization. Since the characteristics of tasks for the block Cholesky factorization are different from those of the conventional parallel task model, we also propose a new task scheduling algorithm using the block dependency DAG. The proposed algorithm consists of two stages: early-start clustering and affined cluster mapping. The early-start clustering stage clusters tasks while preserving the earliest start time of each task, without limiting parallelism. After task clustering, the affined cluster mapping algorithm allocates clusters to processors, considering both communication cost and load balance. Experimental results on the Fujitsu parallel system show that the proposed task scheduling approach outperforms other processor mapping methods.
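As a rough illustration of the idea of a block dependency DAG, the sketch below derives block-level dependencies from the standard block Cholesky recurrence and computes an earliest start time for each block. It is not the paper's algorithm: the function names `block_dependency_dag` and `earliest_start`, the unit task cost, and the omission of communication cost are all assumptions made for the example, and the input block set is assumed to include every diagonal block and to be already closed under fill-in.

```python
from collections import defaultdict


def block_dependency_dag(blocks):
    """Map each nonzero block (i, j), i >= j, of L to the blocks it waits on.

    Assumption: `blocks` contains every diagonal block and already includes
    fill-in blocks (i.e. symbolic factorization has been done).
    """
    blocks = set(blocks)
    deps = defaultdict(set)
    for (i, j) in blocks:
        deps[(i, j)]  # ensure every block appears as a key
        if i > j:
            # Off-diagonal L_ij is obtained by a triangular solve with L_jj.
            deps[(i, j)].add((j, j))
        # L_ij receives the update L_ik * L_jk^T from every earlier block
        # column k in which both L_ik and L_jk are nonzero.
        for k in range(j):
            if (i, k) in blocks and (j, k) in blocks:
                deps[(i, j)].update({(i, k), (j, k)})
    return dict(deps)


def earliest_start(deps, cost=lambda block: 1):
    """Earliest start time of each block task (longest path from the sources).

    Assumption: communication is free and each task costs `cost(block)`.
    """
    start = {}

    def visit(block):
        if block not in start:
            start[block] = max(
                (visit(p) + cost(p) for p in deps[block]), default=0
            )
        return start[block]

    for block in deps:
        visit(block)
    return start
```

For a dense 3-by-3 block lower triangle, block (2, 2) gets earliest start time 4 under unit costs. If blocks (1, 0) is zero, block columns 0 and 1 become independent and block (2, 2) can start at time 2, which is the kind of parallelism a scheduler working on the DAG can exploit.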


Similar resources

Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-by-blocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in...


A Comparison of 1D and 2D Data Mapping for Sparse LU Factorization with Partial Pivoting

This paper presents a comparative study of two data mapping schemes for parallel sparse LU factorization with partial pivoting on distributed memory machines. Our previous work has developed an approach that incorporates static symbolic factorization, nonsymmetric L/U supernode partitioning, and graph scheduling for this problem with 1D column block mapping. The 2D mapping is commonly considered mor...


Block-Based Compressive Sensing Using Soft Thresholding of Adaptive Transform Coefficients

Compressive sampling (CS) is a new technique for simultaneous sampling and compression of signals in which the sampling rate can be very small under certain conditions. Due to the limited number of samples, image reconstruction based on CS samples is a challenging task. Most of the existing CS image reconstruction methods have a high computational complexity as they are applied on the entire im...


Parallel and fully recursive multifrontal sparse Cholesky

We describe the design, implementation, and performance of a new parallel sparse Cholesky factorization code. The code uses a multifrontal factorization strategy. Operations on small dense submatrices are performed using new dense matrix subroutines that are part of the code, although the code can also use the BLAS and LAPACK. The new code is recursive at both the sparse and the dense levels, i...


Parallel Asynchronous Modelization and Execution of Cholesky Algorithm using Petri Nets

PDPTA 2013. Parallelization of algorithms with hard data dependencies requires task synchronization. Synchronous parallel versions are simple to model and program, but inefficient in terms of scalability and processor utilization. The same problem affects asynchronous versions with elemental static task scheduling. Efficient asynchronous algorithms implement out-of-order execution and are complex t...



Publication date: 1999